Autosplitter for big generated java methods #10367
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request.
We will need you to sign the CLA before we can review this change.
Signed-off-by: Okapist <anonbk@gmail.com>
CLA signed
Just a warning that the best person to review this is out for a bit, so you should expect silence for at least a week.
Back in the office. Reviewing.
On Sat, Aug 6, 2022 at 6:02 PM Matt Fowles Kulukundis wrote:
> Just a warning that the best person to review this is out for a bit, so
> you should expect silence for at least a week
--
Jerry Berg | Software Engineer
@Okapist these are some intriguing results. Thank you for trying this out! Before we go down this path of introducing complexity through splitting and continue to expand the size of generated message code, I'd like us to try comparing the performance using the Lite approach to these methods. See GeneratedMessageLite.hashCode, .equals, .writeTo which use the Schema utility to implement these methods. Would you be willing to take a crack at that for comparison? I have a pressing project that will likely delay me from investigating that approach myself. The goal is to balance performance with the maintenance load from having so many approaches to the problem.
Yep. Performance checks show no difference, as expected.
@Okapist Thanks for running this! If I understand correctly, these are results for running the other methods with the split in place. That will be useful for the comparison I really want to run: the Lite implementation of these methods versus the current implementation and the split implementation. We could then decide whether to introduce splitting or use the Lite implementation.
The Lite implementation is slow. We usually use only serialize and deserialize, and Lite is 20-30% slower than the normal implementation.
Before we proceed with this implementation, what is the reason for incrementally deciding to split vs just looking at the number of fields up front and branching within the top-level method?
@@ -1120,6 +1122,22 @@ void EscapeUtf16ToString(uint16_t code, std::string* output) {
   }
 }

+void MaybeSplitJavaMethod(io::Printer *printer, int* fields_in_function, int* method_num,
Why do we MaybeSplitJavaMethod? Can't we know ahead of time (by the number of fields) how we need to split the top-level method?
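To make the question above concrete, here is a minimal sketch of what deciding the split up front could look like on the generator side. It is not code from this PR: `PlanMethodChunks` and its parameters are hypothetical names, and the only input it assumes is the field count of the message and a per-method cap in the spirit of `kMaxFieldsInMethod`.

```cpp
#include <algorithm>
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical helper (not part of protoc): given the number of fields in a
// message, returns half-open [begin, end) field-index ranges, one range per
// generated helper method. The plan is known before any code is printed.
std::vector<std::pair<int, int>> PlanMethodChunks(int field_count,
                                                  int max_fields_in_method) {
  assert(max_fields_in_method > 0);  // a cap of zero would never terminate
  std::vector<std::pair<int, int>> chunks;
  for (int begin = 0; begin < field_count; begin += max_fields_in_method) {
    int end = std::min(begin + max_fields_in_method, field_count);
    chunks.emplace_back(begin, end);
  }
  return chunks;
}
```

With 10 fields and a cap of 4, this yields the ranges [0, 4), [4, 8) and [8, 10). A message whose field count is at or below the cap gets a single range, so the top-level method would not need to be split at all.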
@Okapist @googleberg Hi, I would like to know the current status of this PR. Has the autosplitter been integrated? I think this feature is still very important. Big protos seriously hurt serialization and deserialization performance in Java. Worse, if a proto has so many fields that a generated Java method exceeds the 64 KB method size limit, it cannot be used at all.
Fix for #10247.
All unit tests pass with a very small kMaxFieldsInMethod=4.
Round-trip benchmark:
Small proto with 15 fields, all fields initialized.
Small proto with 15 fields, 2 fields initialized, 14 empty.
Mid-size proto. Here the split code is active, and the JIT compiles both the old and the new code.
Big proto, around 900 fields. Here the split code is active, and the JIT does not compile the old code.
I see no difference or a slight speedup for small and medium proto files, and a large speedup for large proto files.
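For readers who want to picture the output shape, the sketch below emits Java source for a `hashCode()` that delegates to numbered helper methods once a field cap is reached. It is an illustration under stated assumptions, not the PR's implementation: `EmitSplitHashCode` is a hypothetical name, it builds a plain `std::string` instead of using `io::Printer`, and the emitted hash arithmetic is simplified rather than what protoc actually generates. The motivation matches the benchmark notes: HotSpot by default declines to JIT-compile very large methods (roughly 8000 bytes of bytecode), so keeping each method small lets the JIT work on it.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical emitter (not the PR's code): returns Java source text in which
// hashCode() chains through helpers hashCode0, hashCode1, ..., each covering
// at most max_fields_in_method fields.
std::string EmitSplitHashCode(const std::vector<std::string>& fields,
                              int max_fields_in_method) {
  assert(max_fields_in_method > 0);
  const std::size_t step = static_cast<std::size_t>(max_fields_in_method);
  std::string top = "public int hashCode() {\n  int hash = 41;\n";
  std::string helpers;
  int method_num = 0;
  for (std::size_t begin = 0; begin < fields.size(); begin += step) {
    const std::size_t end = std::min(begin + step, fields.size());
    const std::string name = "hashCode" + std::to_string(method_num++);
    // The top-level method only threads the running hash through helpers.
    top += "  hash = " + name + "(hash);\n";
    helpers += "private int " + name + "(int hash) {\n";
    for (std::size_t i = begin; i < end; ++i) {
      // Simplified per-field mixing; real generated code differs.
      helpers += "  hash = 31 * hash + java.util.Objects.hashCode(" +
                 fields[i] + "_);\n";
    }
    helpers += "  return hash;\n}\n";
  }
  top += "  return hash;\n}\n";
  return top + helpers;
}
```

Calling it with five field names and a cap of 2 produces three helpers (`hashCode0` through `hashCode2`), which mirrors running the unit tests with a deliberately tiny `kMaxFieldsInMethod` to force splitting on ordinary messages.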